N=1:

rank | frequency | n-gram |
---|---|---|
1 | 162639 | -n |
2 | 111968 | -e |
3 | 73888 | -r |
4 | 71327 | -s |
5 | 58365 | -t |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 118173 | -en |
2 | 61972 | -er |
3 | 25211 | -ng |
4 | 23540 | -te |
5 | 17852 | -es |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 26535 | -ten |
2 | 20050 | -ung |
3 | 18729 | -gen |
4 | 11930 | -ter |
5 | 11831 | -hen |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 9964 | -chen |
2 | 9375 | -ngen |
3 | 4985 | -cher |
4 | 4790 | -nden |
5 | 4525 | -tion |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 7523 | -ungen |
2 | 4137 | -schen |
3 | 2887 | -erung |
4 | 2821 | -ische |
5 | 2776 | -enden |

The tables show the most frequent letter N-grams at the end of words for N=1…5. The procedure parallels 2.2.5 Most frequent word beginnings; the aim here is suffix detection rather than prefix detection.
Query for N=3:
SELECT @pos := @pos + 1 AS pos, xx.* FROM (SELECT @pos := 0) r, (SELECT COUNT(*) AS cnt, CONCAT('-', RIGHT(word, 3)) AS ngram FROM words WHERE w_id > 100 GROUP BY RIGHT(word, 3) ORDER BY cnt DESC) xx LIMIT 5;
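
The same counting step can be sketched outside the database for any N. The following Python sketch is only an illustration, not the pipeline used to produce the tables: the function name `top_endings`, its parameters, and the toy word list are hypothetical, and it assumes each input item is one distinct word form (as one row of `words` presumably is). It does not reproduce the `w_id > 100` filter.

```python
from collections import Counter


def top_endings(words, n, k=5):
    """Return the k most frequent word-final letter n-grams as (rank, count, ngram)."""
    # word[-n:] mirrors SQL RIGHT(word, n): a word shorter than n is kept whole.
    counts = Counter("-" + word[-n:] for word in words)
    # most_common(k) sorts by descending count, like ORDER BY cnt DESC LIMIT k.
    return [
        (rank, cnt, ngram)
        for rank, (ngram, cnt) in enumerate(counts.most_common(k), start=1)
    ]


# Hypothetical toy word list, not taken from the corpus.
sample = ["machen", "lachen", "Zeitungen", "Rechnungen", "gehen", "Lesung"]

for n in range(1, 6):
    print(n, top_endings(sample, n, k=2))
```

Unlike the query, which is written for a fixed N, the sketch takes N as a parameter, so one loop covers all five tables.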